HTML Handbook

Introduction to HTML

What is HTML?

Hypertext Markup Language (HTML) is a system for marking up documents with tags that indicate how text in the documents should be presented and how the documents are linked together. Hypertext links are quite powerful. Within the HTML markup scheme lies the power to create interactive, cross-platform, multimedia, client-server applications. This string of adjectives is not just hype; such systems do exist. One, called the World Wide Web (WWW), lives on the Internet, providing organization to a wide variety of computer resources around the globe.

HTML is not a programming language and an HTML document is not a computer program. It's a lot simpler than that. The definition of HTML specifies the grammar and syntax of markup tags that, when inserted into data, instruct browsers -- computer programs that read HTML documents -- how to present the document.

Standard Generalized Markup Language

Technically, HTML is defined as a Standard Generalized Markup Language (SGML) Document Type Definition (DTD). An HTML document is said to be an instance of an SGML document.

SGML originated as GML (Generalized Markup Language) at IBM in the late 1960s as an attempt to solve some of the problems of transporting documents across different computer systems. The term markup comes from the publishing industry. SGML is generalized, meaning that instead of specifying exactly how to present a document, it describes document types, along with markup languages to format and present instances of each type. GML became SGML when it was accepted as a standard by the International Organization for Standardization (ISO) in Geneva, Switzerland (reference number ISO 8879:1986).

The Three Parts to an SGML Document

An SGML document has three parts. The first describes the character set and, most importantly, which characters are used to differentiate the text from the markup tags. The second part declares the document type and which markup tags are accepted as legal. The third part is called the document instance and contains the actual text and markup tags. The three parts need not be in the same physical file, which is a good thing because it allows us to forget about SGML and deal only with HTML. All HTML browsers assume the same information for the SGML character-set and document-type declarations, so we only have to work with HTML document instances -- simple text files.

Base character set

The base character set of an HTML document is Latin-1 (ISO 8859/1). It's an 8-bit alphabet with characters for most American and European languages. Plain old ASCII (ISO 646) is a 7-bit subset of Latin-1. There is no obligation to use anything but the 128 standard ASCII characters in an HTML document. In fact, sticking to straight ASCII is encouraged, as it allows an HTML document to be edited by any text editor on any computer system and be transported over any network by even the most rudimentary of e-mail and data transport systems. To make this possible, HTML includes character entities for most of the commonly used non-ASCII Latin-1 characters. These character entities begin with the ampersand character (&), followed by the name or number of the character, followed by a semicolon.

HTML markup tags are delimited by the angle brackets, < and >. They appear either singly, like the tag <P> to indicate a paragraph break in the text, or as a pair of starting and ending tags that modify the content they enclose.

That's all there is to Hypertext Markup Language -- character entities and markup tags. However, this system of entities and tags is evolving. There are currently several standardization levels of HTML.

HTML levels

Level 1 is the level mandatory for all WWW browsers. It is essentially what was accepted by the first browsers (level 0), plus images.

Level 2 includes all the elements of level 1, plus tags for defining user input fields. This is currently the standard although many browsers already support level 3 elements.

Level 3, also known as HTML 3, is being finalized. It includes markup tags for objects such as tables, figures, and mathematical equations.

HTML Pointers for the Beginner

Common Terms

WWW: World Wide Web

Web: World Wide Web

SGML: Standard Generalized Markup Language--a standard for describing markup languages

DTD: Document Type Definition--this is the formal specification of a markup language, written using SGML

HTML: HyperText Markup Language. HTML is a collection of platform-independent styles (indicated by markup tags) that define the various components of a World Wide Web document. HTML was invented by Tim Berners-Lee while at CERN, the European Laboratory for Particle Physics in Geneva.

What Is an HTML Document?

HTML documents are text (ASCII) files that can be created using any text editor (e.g., Emacs or vi on UNIX machines; BBEdit on a Macintosh; Notepad on a Windows machine). You can also use word-processing software if you save your document as "text only with line breaks."

HTML Editors

Some WYSIWYG editors are available (e.g., HoTMetaL, which is available for several platforms, or Adobe PageMill for Macintoshes). You may wish to try one of them after you learn some of the basics of HTML tagging. It is useful to know enough HTML to code a document before you determine the usefulness of a WYSIWYG editor.

If you haven't already selected your software, refer to an online listing of HTML editors (organized by platform) to help you in your search for appropriate software.

Getting Your Files on a Server

If you have access to a Web server at school or work, contact your webmaster (the individual who maintains the server) to see how you can get your files on the Web. If you do not have access to a server at work or school, check to see if your community operates a FreeNet, a community-based network that provides free access to the Internet. You may need to contact a local Internet provider that will post your files on a server for a fee.

The Bare Bones HTML Document

Teaching Tools

Every HTML document should contain certain standard HTML tags. Each document consists of head and body text. The head contains the title, and the body contains the actual text that is made up of paragraphs, lists, and other elements. Browsers expect specific information because they are programmed according to HTML and SGML specifications.

Required elements are shown in this sample:

<html>

<head>

<TITLE>This is an HTML Example</TITLE>

</head>

<body>

<H1>Learn HTML from the Pros</H1>

<P>This is the first paragraph. </P>

<P>This is the second paragraph.</P>

</body>

</html>

The required elements are the <html>, <head>, <title>, and <body> tags (and their corresponding end tags). Because you should include these tags in each file, you might want to create a template file with them. (Some browsers will format your HTML file correctly even if these tags are not included. But some browsers won't! So make sure to include them.)

Teaching Tools

To see a copy of the file that your browser reads to generate the information in your current window, select View Source (or the equivalent) from the browser menu. The file contents, with all the HTML tags, are displayed in a new window.

This is an excellent way to see how HTML is used and to learn tips and constructs. Of course, the HTML might not be technically correct. Once you become familiar with HTML and check the many online and hard-copy references on the subject, you will learn to distinguish between "good" and "bad" HTML.

Remember that you can save a source file with the HTML codes and use it as a template for one of your Web pages or modify the format to suit your purposes.

Page Structure

A file containing the marked-up text of a Web page is called an HTML file. It begins and ends with the tags <HTML> and </HTML>. It is divided into two parts, a head and a body. The head contains information about the document and the body contains the text of the document. Markup tags are used to define the two parts, as in this minimal HTML file:

<HTML>

<HEAD>

<TITLE>Minimal HTML Page</TITLE>

</HEAD>

<BODY>

Your text goes here with embedded HTML markup tags that describe the elements of the page and specify where the inline images go. Browsers automatically remove redundant white space from a paragraph and word-wrap the text to fit the width of the browser's display windows.

</BODY>

</HTML>

It's important to understand that it is the markup elements <HEAD></HEAD> and <BODY></BODY> that divide the page into its two parts, not the carriage returns and line spacing used. There are two kinds of HTML elements: markup tags and character entities.

Tag syntax

Every markup tag has a tag ID (or name) and possibly some attributes. Markup tags are either empty or nonempty. Nonempty tags, also called containers, act upon text enclosed in a pair of starting and ending tags. A starting tag begins with the left angle bracket (<) followed immediately by the tag ID, zero or more attributes separated by spaces, then the right angle bracket (>) to close the tag. Ending tags are exactly the same except there is a slash (/) immediately between the opening left bracket and the tag ID. Ending tags do not contain attributes. Whereas containers modify content, empty tags insert things into the content. The empty tags stand alone; there's no corresponding tag with a slash. Here are some examples of empty tags:

<BR> Line break, following text begins at the left margin.

<HR> Horizontal rule, draw a line across the page.

The following empty tag specifies that an inline image be inserted. It has one attribute, the SRC attribute, whose value is the name (source) of the file containing the image:
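A minimal sketch of such a tag looks like this. The file name portrait.gif is a made-up placeholder, not a file referred to anywhere in this handbook.

```html
<!-- Empty tag with a single attribute; portrait.gif is a hypothetical file name -->
<IMG SRC="portrait.gif">
```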

Attributes

Attributes take the form of NAME=VALUE, where the value is appropriate to the domain of the attribute. The value should be enclosed in double quotes, although it's safe to drop the quotes when the value is a simple number or constant. If there's more than one attribute in a tag, they are separated by blanks, not commas. Some attributes are specified just by the name; for example, BORDER is equivalent to BORDER="yes", which is also the same as BORDER=1.

Anchors

Anchors are tags that define the nodes of hypertext links. Usually, the browser highlights a link by making it blue and underlined to indicate that clicking on it (or selecting it, if you're using a nongraphical browser) will take the reader to another page. The attribute that names the link's destination is HREF (Hypertext REFerence).

Structure Tags

HTML tags can be divided into two loose classes -- those that change the page structure and those that change text styles.

Headings

Major divisions of a document are introduced and separated by headings. HTML supports six levels of headings designated by the tag pairs <H1></H1>, <H2></H2>, <H3></H3>, <H4></H4>, <H5></H5>, and <H6></H6>. This is significant for most hypertext applications, because much of the structure of a hypertext work is in the web of links. Additional structure can be generated by using list and table elements. All heading tags are containers and require a corresponding end tag.

H1 is the highest level of heading. It is customary to put a level 1 heading as the first element in the body of the home page to serve as the internal title of the page, as opposed to the window title, which is created by the <TITLE></TITLE> tags. A heading element implies a style change, including a paragraph break before and after the heading, and whatever white space is needed to render the heading of that level. Adding style tags to emphasize a heading is neither required nor recommended.

Common Attributes

ALIGN is one of a set of common attributes that can be used with headings and most of the other structural markup tags. The ALIGN attribute can have the values "left" (the default), "center", "right", and "justify".

NOWRAP can be specified with most tags to turn off the normal text wrapping. Within the text contained by the tags, line breaks (<BR>) must be used to separate the lines.

The ID attribute can be added to a tag to assign a name to the enclosed content. This name can then be specified as part of a URL, thus providing the ability to link to specific points within the body of the document.

The LANG attribute can be used to specify that an alternate language should be applied to the tag's contents.

The CLASS attribute is used to assign a class name to the content of the HTML element. Class names are referenced in style sheets for more precise typographic control. The value of the CLASS attribute is a simple name.

The CLEAR attribute is used in conjunction with content that flows around images and figures. It is an instruction to space down the page, as far as necessary, until the margin is clear before placing the content. The CLEAR attribute can have a value of "left", "right" or "all".
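As a rough sketch, the common attributes described above might be combined as follows. The ID, CLASS, and LANG values are invented for illustration, and because these are level 3 attributes, a given browser may ignore some of them.

```html
<!-- All attribute values below are illustrative placeholders -->
<H2 ID="summary" ALIGN="center">Summary</H2>

<!-- CLEAR asks the browser to move down until the left margin is free -->
<P CLASS="abstract" CLEAR="left">This paragraph starts below any image in the left margin.</P>

<!-- With NOWRAP, lines are separated only by explicit line breaks -->
<P NOWRAP>First line of the stanza<BR>Second line of the stanza</P>

<P LANG="fr">Bonjour tout le monde.</P>

<!-- The ID assigned above is used as the target of a link -->
<A HREF="#summary">Back to the summary</A>
```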

Paragraphs

Paragraphs are referred to in HTML as block elements and include plain text paragraphs and some special-purpose paragraphs, such as external quotes, side notes, footers, and such. Browsers insert paragraph breaks and extra line spacing before and after all block elements. In the text contained within a block element, all carriage returns, tabs, control characters, and redundant blanks are replaced with single spaces, and the text is word-wrapped to fit the reader's display window. Your everyday text paragraph is created by paragraph tags, <P></P>. The first line of the paragraph may or may not appear indented -- that's up to the browser. The paragraph is the simplest HTML element and may contain any of the common attributes mentioned in the preceding section.

Block Elements

Blockquote

Often some portion of a page's content is material quoted from an external work. To visually distinguish such paragraphs, the blockquote element is used. The tags are written <BLOCKQUOTE></BLOCKQUOTE>, and the contained text is rendered as a paragraph with wider right and left margins than a plain text paragraph. If more than one paragraph is quoted, it's permissible to use empty paragraph tags to separate paragraphs within the blockquote rather than using separate blockquote tags to enclose each paragraph.

Address

The address element, written <ADDRESS></ADDRESS>, is used for signatures, addresses, and other authorship information usually appearing at the top or bottom of a page. Address text is typically rendered in italic and may be indented or right justified. No more than a single paragraph of text should be in an address block. Use line break tags, <BR>, if you want to lay out the address content as separate lines of information.

Note

The note element, written <NOTE></NOTE>, is used for side notes and other extra text material. Notes are typically indented or boxed or rendered in a smaller type size, often with an accompanying icon. The role is determined by the ROLE attribute. For example, to indicate a tip in the document the codes would look as follows:
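A sketch of what such a tip might look like is shown here. The ROLE value "TIP" and the note text are assumptions made for illustration; the role names a particular browser recognizes may differ.

```html
<!-- ROLE="TIP" is an assumed value; check which roles your browser supports -->
<NOTE ROLE="TIP">Save your file as plain text before loading it in a browser.</NOTE>
```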

Banner

The banner element, <BANNER></BANNER>, is used to fix a block of text to a position relative to the display window, rather than the page. Banner text remains in its position as the rest of the page's content scrolls underneath.

Preformatted Text

The preformatted text block element is sort of an antiparagraph. Any text between the starting and ending tags, <PRE></PRE>, will be essentially left as is -- well, almost. Preformatted text is rendered in a monospaced font, and all line breaks and redundant blanks are retained.

Markup Tags

HTML

This element tells your browser that the file contains HTML-coded information. The file extension .html also indicates that this is an HTML document and must be used. (If you are restricted to 8.3 filenames, e.g., LeeHome.htm, use only .htm for your extension.)

HEAD

The head element identifies the first part of your HTML-coded document that contains the title. The title is shown as part of your browser's window.

TITLE

The title element contains your document title and identifies its content in a global context. The title is displayed somewhere on the browser window (usually at the top), but not within the text area. The title is also what is displayed on someone's hotlist or bookmark list, so choose something descriptive, unique, and relatively short. A title is also used during a WAIS search of a server.

BODY

The second--and largest--part of your HTML document is the body, which contains the content of your document (displayed within the text area of your browser window). The tags explained below are used within the body of your HTML document.

Headings

HTML has six levels of headings, numbered 1 through 6, with 1 being the most prominent. Headings are displayed in larger and/or bolder fonts than normal body text. The first heading in each document should be tagged <H1>.

The syntax of the heading element is:

<Hy>Text of heading </Hy> where y is a number between 1 and 6 specifying the level of the heading.

Do not skip levels of headings in your document. For example, don't start with a level-one heading (<H1>) and then next use a level-three (<H3>) heading.
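The following sketch shows heading levels used in order, without skipping a level on the way down. The heading text is invented for illustration.

```html
<!-- Each heading is at most one level deeper than the one before it -->
<H1>User Manual</H1>
<H2>Installation</H2>
<H3>System Requirements</H3>
<H2>Troubleshooting</H2>
```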

Paragraphs

Unlike documents in most word processors, carriage returns in HTML files aren't significant. So you don't have to worry about how long your lines of text are (better to have them fewer than 72 characters long though). Word wrapping can occur at any point in your source file, and multiple spaces are collapsed into a single space by your browser.

Important: You must indicate paragraphs with <P> elements. A browser ignores any indentations or blank lines in the source text. Without <P> elements, the document becomes one large paragraph.

To preserve readability in HTML files, put headings on separate lines, use a blank line or two where it helps identify the start of a new section, and separate paragraphs with blank lines (in addition to the <P> tags). These extra spaces will help you when you edit your files (but your browser will ignore the extra spaces because it has its own set of rules on spacing that do not depend on the spaces you put in your source file).

NOTE: The </P> closing tag can be omitted. This is because browsers understand that when they encounter a <P> tag, it implies that there is an end to the previous paragraph.

Using the <P> and </P> as a paragraph container means that you can center a paragraph by including the ALIGN=alignment attribute in your source file.
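For instance, a centered paragraph might be written as in this small sketch (the paragraph text is a placeholder):

```html
<!-- ALIGN takes one of the alignment values, here "center" -->
<P ALIGN="center">
This is a centered paragraph.
</P>
```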

Lists

HTML supports unnumbered, numbered, and definition lists. You can nest lists too, but use this feature sparingly because too many nested items can get difficult to follow.

Unnumbered Lists

To make an unnumbered, bulleted list:

1. Start with an opening list <UL> (for unnumbered list) tag.
2. Enter the <LI> (list item) tag followed by the individual item; no closing </LI> tag is needed.
3. End the entire list with a closing list </UL> tag.

The <LI> items can contain multiple paragraphs. Indicate the paragraphs with the <P> paragraph tags.
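Put together, the steps for an unnumbered list might look like the following sketch. The list items are placeholders.

```html
<UL>
<LI> apples
<LI> bananas
<LI> grapefruit
</UL>
```

A browser typically shows each item on its own line with a bullet in front of it.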

Numbered Lists

A numbered list (also called an ordered list, from which the tag name derives) is identical to an unnumbered list, except it uses <OL> instead of <UL>. The items are tagged using the same <LI> tag.
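As a sketch, the same kind of list with <OL> in place of <UL> would be written like this (the items are placeholders). The browser supplies the numbers, so you do not type them yourself.

```html
<OL>
<LI> oranges
<LI> peaches
<LI> grapes
</OL>
```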

Definition Lists

A definition list (coded as <DL>) usually consists of alternating a definition term (coded as <DT>) and its definition (coded as <DD>). Web browsers generally format the definition on a new line.

The <DT> and <DD> entries can contain multiple paragraphs (indicated by <P> paragraph tags), lists, or other definition information.

The COMPACT attribute can be used routinely in case your definition terms are very short. If, for example, you are showing some computer options, the options may fit on the same line as the start of the definition.
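Here is a small sketch of a compact definition list for some computer options. The option names and their descriptions are invented for illustration.

```html
<!-- COMPACT suggests that short terms share a line with their definitions -->
<DL COMPACT>
<DT> -i
<DD> reads settings from the initialization file named after the option
<DT> -k
<DD> keeps temporary files instead of deleting them when the program exits
</DL>
```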

Nested Lists

Lists can be nested. You can also have a number of paragraphs, each containing a nested list, in a single list item. For this example, we'll create an ordered list that contains a nested unnumbered list. Enter the first line by typing <OL>. Enter your list items one by one, beginning each item with <LI>. When you reach a step that requires a nested list, begin another list by typing <UL>. The Web browser will automatically format this new list to fall underneath the current item in the first list. Enter the items in your new list. When you're finished, type </UL> to close the nested list. You must close the nested list before continuing to enter items in the original list. Enter the remaining items in the original list, then type </OL> when you are done.
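The procedure above could produce something like the following sketch: an ordered list whose second item holds a nested unnumbered list. The item text is made up, and the indentation is only there to make the source easier to read.

```html
<OL>
<LI> Open your text editor.
<LI> Type the required tags:
    <UL>
    <LI> HTML
    <LI> HEAD and TITLE
    <LI> BODY
    </UL>
<LI> Save the file with an .html extension.
</OL>
```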

Preformatted Text

PRE tag

Use the <PRE> tag (which stands for "preformatted") to generate text in a fixed-width font. This tag also makes spaces, new lines, and tabs significant (multiple spaces are displayed as multiple spaces, and lines break in the same locations as in the source HTML file). This is useful for program listings, among other things.

The <PRE> tag can be used with an optional WIDTH attribute that specifies the maximum number of characters for a line. WIDTH also signals your browser to choose an appropriate font and indentation for the text.

Hyperlinks can be used within <PRE> sections. You should avoid using other HTML tags within <PRE> sections, however.

Note that because <, >, and & have special meanings in HTML, you must use their escape sequences (&lt;, &gt;, and &amp;, respectively) to enter these characters.
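The sketch below shows a short program listing inside a <PRE> section, with the special characters written as escape sequences. The listing itself and the WIDTH value of 40 are arbitrary choices made for illustration.

```html
<!-- Spaces and line breaks inside PRE are kept as typed -->
<PRE WIDTH="40">
if (a &lt; b &amp;&amp; b &gt; 0) {
    total = total + 1;
}
</PRE>
```

In the browser, the first line of the listing appears with the literal characters <, &&, and >.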

Extended Quotations

Use the <BLOCKQUOTE> tag to include lengthy quotations in a separate block on the screen. Most browsers change the margins for the quotation to separate it from surrounding text.

Addresses

The <ADDRESS> tag is generally used to specify the author of a document, a way to contact the author (e.g., an email address), and a revision date. It is usually the last item in a file.

NOTE: <ADDRESS> is not used for postal addresses.
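A typical use might look like this sketch. The document name, author, e-mail address, and revision note are all placeholders.

```html
<!-- Every detail inside the address block is a made-up placeholder -->
<ADDRESS>
A Beginner's Guide to HTML / J. Doe / jdoe@example.org / revised this spring
</ADDRESS>
```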

Forced Line Breaks/Postal Addresses

The <BR> tag forces a line break with no extra (white) space between lines. Using <P> elements for short lines of text such as postal addresses results in unwanted additional white space. For example, with <BR>:
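One way such an address could be written is sketched here. The organization name and postal address are fictitious.

```html
<!-- Each BR ends a line without adding blank space after it -->
Example Web Services<BR>
123 Main Street<BR>
Anytown, ST 00000<BR>
```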

Horizontal Rules

The <HR> tag produces a horizontal line the width of the browser window. A horizontal rule is useful to separate sections of your document. For example, many people add a rule at the end of their text and before the <address> information.

You can vary a rule's size (thickness) and width (the percentage of the window covered by the rule). Experiment with the settings until you are satisfied with the presentation.
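For instance, the following sketch asks for a rule four pixels thick that spans half the width of the window. The two values are arbitrary starting points, and some browsers may ignore these attributes and draw a default rule.

```html
<!-- SIZE sets the thickness; WIDTH sets the share of the window covered -->
<HR SIZE=4 WIDTH="50%">
```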

Character Formatting

HTML has two types of styles for individual words or sentences: logical and physical. Logical styles tag text according to its meaning, while physical styles indicate the specific appearance of a section.

Logical Styles

<DFN> for a word being defined. Typically displayed in italics. (For example, in "NCSA Mosaic is a World Wide Web browser," NCSA Mosaic is the term being defined.)

<EM> for emphasis. Typically displayed in italics.

<CITE> for titles of books, films, etc. Typically displayed in italics.

<CODE> for computer code. Displayed in a fixed-width font.

<KBD> for user keyboard entry. Typically displayed in plain fixed-width font.

<SAMP> for a sequence of literal characters. Displayed in a fixed-width font.

<STRONG> for strong emphasis. Typically displayed in bold.

<VAR> for a variable, where you will replace the variable with specific information. Typically displayed in italics.
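The logical styles listed above might be used in running text as in this sketch. The sentences are invented purely to show each tag once.

```html
<P><DFN>Hypertext</DFN> is text that contains links to other text.
This point is <EM>important</EM>, and this one is <STRONG>critical</STRONG>.
The quotation comes from <CITE>Alice's Adventures in Wonderland</CITE>.</P>

<P>Type <KBD>ls -l</KBD> at the prompt, then open <VAR>filename</VAR> in your editor.
The function <CODE>main()</CODE> prints <SAMP>Hello</SAMP> on the screen.</P>
```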

Physical Styles

<B> bold text

<I> italic text

<TT> typewriter text, i.e., fixed-width font.
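A one-line sketch of the three physical styles in use (the sentence is a placeholder):

```html
<P>This is <B>bold</B>, this is <I>italic</I>, and this is <TT>typewriter</TT> text.</P>
```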

Escape Sequences (a.k.a. Character Entities)

Character entities have two functions:

escaping special characters
displaying other characters not available in the plain ASCII character set (primarily characters with diacritical marks)

Three ASCII characters--the left angle bracket (<), the right angle bracket (>), and the ampersand (&)--have special meanings in HTML and therefore cannot be used "as is" in text. (The angle brackets are used to indicate the beginning and end of HTML tags, and the ampersand is used to indicate the beginning of an escape sequence.)

To use one of the three characters in an HTML document, you must enter its escape sequence instead:

&lt; the escape sequence for <

&gt; the escape sequence for >

&amp; the escape sequence for &
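In a document, the escape sequences might appear as in this sketch. The sentences are invented, and &eacute; is included as one example of an entity for a character with a diacritical mark.

```html
<!-- Displays as: The <P> tag starts a paragraph. -->
<P>The &lt;P&gt; tag starts a paragraph.</P>

<!-- The ampersand itself must also be escaped -->
<P>Research &amp; Development maintains this page.</P>

<!-- An accented character outside plain ASCII -->
<P>Meet us at the caf&eacute; on the corner.</P>
```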

Linking

The chief power of HTML comes from its ability to link text and/or an image to another document or section of a document. A browser highlights the identified text or image with color and/or underlines to indicate that it is a hypertext link (often shortened to hyperlink or link).

HTML's single hypertext-related tag is <A>, which stands for anchor. To include an anchor in your document:

1. Start the anchor with <A (include a space after the A).
2. Specify the document you're linking to by entering the parameter HREF="filename" followed by a closing right angle bracket (>).
3. Enter the text that will serve as the hypertext link in the current document.
4. Enter the ending anchor tag: </A> (no space is needed before the end anchor tag).
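Following those steps gives an anchor like the one in this sketch. The file name MaineStats.html and the link text are placeholders.

```html
<!-- The word Maine becomes the link; MaineStats.html is a hypothetical file -->
<A HREF="MaineStats.html">Maine</A>
```

Here MaineStats.html is assumed to sit in the same directory as the document that contains the link.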

Relative Pathnames Versus Absolute Pathnames

You can link to documents in other directories by specifying the relative path from the current document to the linked document. These are called relative links because you are specifying the path to the linked file relative to the location of the current file. You can also use the absolute pathname (the complete URL) of the file, but relative links are more efficient in accessing a server.

Pathnames use the standard UNIX syntax. The UNIX syntax for the parent directory (the directory that contains the current directory) is "..". (For more information consult a beginning UNIX reference text such as Learning the UNIX Operating System from O'Reilly and Associates, Inc.)

Use relative links because it's easier to move a group of documents to another location (because the relative path names will still be valid), it's more efficient connecting to the server, and there is less to type.

Use absolute pathnames when linking to documents that are not directly related. For example, consider a group of documents that comprise a user manual. Links within this group should be relative links. Links to other documents (perhaps a reference to related software) should use full path names. This way if you move the user manual to a different directory, none of the links would have to be updated.
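The difference between the two kinds of links can be sketched as follows. The directory names, file names, and the host www.example.org are all hypothetical.

```html
<!-- Relative link: go up to the parent directory, then into a sibling directory -->
<A HREF="../appendix/glossary.html">Glossary</A>

<!-- Absolute link: the complete URL, used for an unrelated document -->
<A HREF="http://www.example.org/software/index.html">Related software</A>
```

If the manual's directories are moved together, the first link keeps working unchanged, while the second never depended on the manual's location in the first place.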

URLs

The World Wide Web uses Uniform Resource Locators (URLs) to specify the location of files on other servers. A URL includes the type of resource being accessed (e.g., Web, gopher, WAIS), the address of the server, and the location of the file. The syntax is:

scheme://host.domain[:port]/path/filename

where scheme is one of

file: a file on your local system

ftp: a file on an anonymous FTP server

http: a file on a World Wide Web server

gopher: a file on a Gopher server

wais: a file on a WAIS server

news: a Usenet newsgroup

telnet: a connection to a Telnet-based service

The port number can generally be omitted. (That means unless someone tells you otherwise, leave it out.)

Links to Specific Sections

Anchors can also be used to move a reader to a particular section in a document (either the same or a different document) rather than to the top, which is the default. This type of an anchor is commonly called a named anchor because to create the links, you insert HTML names within the document.
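A named anchor and the links that point to it might look like this sketch. The name install and the file name chapter2.html are invented for illustration.

```html
<!-- In the target document (hypothetically chapter2.html): mark the destination -->
<H2><A NAME="install">Installation</A></H2>

<!-- In another document: link to that section by adding #name to the file name -->
<A HREF="chapter2.html#install">installation instructions</A>

<!-- Within chapter2.html itself: the file name can be left out -->
<A HREF="#install">installation instructions</A>
```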

Inline Images

Most Web browsers can display inline images (that is, images next to text) that are in X Bitmap (XBM), GIF, or JPEG format. Other image formats are being incorporated into Web browsers [e.g., the Portable Network Graphics (PNG) format]. Each image takes time to process and slows down the initial display of a document. Carefully select your images and the number of images in a document.

To include an inline image, enter:

<IMG SRC=ImageName>

where ImageName is the URL of the image file.

The syntax for <IMG SRC> URLs is identical to that used in an anchor HREF. If the image file is a GIF file, then the filename part of ImageName must end with .gif. Filenames of X Bitmap images must end with .xbm; JPEG image files must end with .jpg or .jpeg; and Portable Network Graphic files must end with .png.

Image Attributes

The image tag has three important attributes:

SRC -- The source attribute is mandatory. Its value is the URL of the file containing the image to be embedded. Specify the URL the same way as that of the HREF attribute used in the anchor tag.

ALIGN -- For an inline image, one of the three values: top, middle, or bottom, to define how the image should be aligned with the adjacent text and other HTML elements.

ALT -- The ALT attribute is used to specify a text string that can be displayed if the image is not available or the reader has chosen not to load images.

Image Size Attributes

You should include two other attributes on <IMG> tags to tell your browser the size of the images it is downloading with the text. The HEIGHT and WIDTH attributes let your browser set aside the appropriate space (in pixels) for the images as it downloads the rest of the file.

NOTE: Some browsers use the HEIGHT and WIDTH attributes to stretch or shrink an image to fit into the allotted space when the image does not exactly match the attribute numbers. Not all browser developers think stretching/shrinking is a good idea. So don't plan on your readers having access to this feature. Check your dimensions and use the correct ones.
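An image tag that carries the size attributes along with the others might look like this sketch. The file name and the pixel dimensions are placeholders; use the real dimensions of your own image.

```html
<!-- HEIGHT and WIDTH are in pixels and should match the actual image -->
<IMG SRC="SelfPortrait.gif" ALT="Self portrait" HEIGHT="100" WIDTH="65">
```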

Aligning Images

You have some flexibility when displaying images. You can have images separated from text and aligned to the left or right or centered. Or you can have an image aligned with text. Try several possibilities to see how your information looks best.

Aligning Text with an Image

By default the bottom of an image is aligned with the following text. You can align the text with the top or middle of an image using the ALIGN= attribute values TOP and MIDDLE.
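The sketch below uses two of the ALIGN values listed under Image Attributes (top, middle, and bottom). The file name logo.gif and the text are placeholders.

```html
<!-- Text lines up with the top edge of the image -->
<P><IMG SRC="logo.gif" ALT="Logo" ALIGN="top"> This text starts level with the top of the image.</P>

<!-- Text lines up with the vertical middle of the image -->
<P><IMG SRC="logo.gif" ALT="Logo" ALIGN="middle"> This text starts level with the middle of the image.</P>
```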

Images without Text

To display an image without any associated text (e.g., your organization's logo), make it a separate paragraph. Use the paragraph ALIGN= attribute to center the image or adjust it to the right side of the window.
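A centered logo in its own paragraph might be written as in this sketch (logo.gif is a hypothetical file name):

```html
<!-- The paragraph's ALIGN attribute positions the image it contains -->
<P ALIGN="center">
<IMG SRC="logo.gif" ALT="Organization logo">
</P>
```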

Alternate Text for Images

Some World Wide Web browsers--primarily those that run on VT100 terminals--cannot display images. Some users turn off image loading even if their software can display images (especially if they are using a modem or have a slow connection). HTML provides a mechanism to tell readers what they are missing on your pages.

The ALT attribute lets you specify text to be displayed instead of an image. For example:

<IMG SRC="UpArrow.gif" ALT="Up">

where UpArrow.gif is the picture of an upward pointing arrow. With graphics-capable viewers that have image-loading turned on, you see the up arrow graphic. With a VT100 browser or if image-loading is turned off, the word Up is shown in your window.

You should try to include alternate text for each image you use in your document, which is a courtesy for your readers.

Background Graphics

Newer versions of Web browsers can load an image and use it as a background when displaying a page. Some people like background images and some don't. In general, if you want to include a background, make sure your text can be read easily when displayed on top of the image.

Background images can be a texture (linen finished paper, for example) or an image of an object (a logo possibly). You create the background image as you do any image.

However you only have to create a small piece of the image. Using a feature called tiling, a browser takes the image and repeats it across and down to fill your browser window. In sum you generate one image, and the browser replicates it enough times to fill your window. This action is automatic when you use the background tag shown below.

The tag to include a background image is included in the <BODY> statement as an attribute:
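A sketch of the attribute in use is shown here. BACKGROUND is the attribute name; the file name texture.gif is a made-up placeholder for your own small image.

```html
<!-- The browser tiles texture.gif across and down the window -->
<BODY BACKGROUND="texture.gif">
Your page content goes here.
</BODY>
```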

Background Color

By default, browsers display text in black on a gray background. However, you can change both elements if you want. Some HTML authors select a background color and coordinate it with a change in the color of the text. Always preview changes like this to make sure your pages are readable.

You change the color of text, links, visited links, and active links using attributes of the <BODY> tag, each of which takes a six-digit color value. For example, you can create a window with a black background (BGCOLOR), white text (TEXT), and silvery hyperlinks (LINK).
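As a sketch of the kind of <BODY> tag described here, the following uses 000000 for black and FFFFFF for white. The link value C0C0C0 (a common code for silver) and the VLINK and ALINK values are assumptions chosen only for illustration; each value is written with a leading # sign.

```html
<!-- Black background, white text, silvery links.
     C0C0C0, the VLINK value, and the ALINK value are illustrative assumptions. -->
<BODY BGCOLOR="#000000" TEXT="#FFFFFF" LINK="#C0C0C0" VLINK="#808080" ALINK="#FF0000">
<P>Page contents go here. <A HREF="other.html">A sample link</A>.</P>
</BODY>
```

Preview a page like this in more than one browser, since light text on a dark background is easy to get wrong.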

The six-digit number and letter combinations represent colors by giving their RGB (red, green, blue) value. The six digits are actually three two-digit numbers in sequence, representing the amount of red, green, and blue as hexadecimal values in the range 00-FF. For example, 000000 is black (no color at all), FF0000 is bright red, and FFFFFF is white (fully saturated with all three colors). These number and letter combinations are cryptic. Fortunately, online resources are available to help you track down the combinations that map to specific colors:

* ColorPro Web server

* Yahoo's links to documents on backgrounds

External Images, Sounds, and Animations

You may want to have an image open as a separate document when a user activates a link on either a word or a smaller, inline version of the image included in your document. This is called an external image, and it is useful if you do not wish to slow down the loading of the main document with large inline images.

To include a reference to an external image, enter:

<A HREF="MyImage.gif">link anchor</A>

You can also use a smaller image as a link to a larger image. Enter:

<A HREF="LargerImage.gif"><IMG SRC="SmallImage.gif"></A>

The reader sees the SmallImage.gif image and clicks on it to open the LargerImage.gif file.

Use the same syntax for links to external animations and sounds. The only difference is the file extension of the linked file. For example,

<A HREF="AdamsRib.mov">link anchor</A>

specifies a link to a QuickTime movie. Some common file types and their extensions are:

plain text: .txt

HTML document: .html

GIF image: .gif

TIFF image: .tiff

X Bitmap image: .xbm

JPEG image: .jpg or .jpeg

PostScript file: .ps

AIFF sound file: .aiff

AU sound file: .au

WAV sound file: .wav

QuickTime movie: .mov

MPEG movie: .mpeg or .mpg

Keep in mind your intended audience and their access to software. Most UNIX workstations, for instance, cannot view QuickTime movies.

Tables

Before HTML tags for tables were finalized, authors had to carefully format their tabular information within <PRE> tags, counting spaces and previewing their output. Tables are very useful for presenting tabular information, and they are a boon to creative HTML authors who use the table tags to lay out their regular Web pages.

Think of your tabular information in light of the coding explained below. A table has heads where you explain what the columns or rows include, rows for information, and cells for each item. In the following table, the first column contains the header information, each row explains an HTML table tag, and each cell contains a paired tag or an explanation of the tag's function.

Table Elements

Element                    Description

<TABLE> ... </TABLE>       Defines a table in HTML. If the BORDER attribute is
                           present, your browser displays the table with a
                           border.

<CAPTION> ... </CAPTION>   Defines the caption for the title of the table. The
                           default position of the title is centered at the
                           top of the table. The attribute ALIGN=BOTTOM can be
                           used to position the caption below the table.
                           NOTE: Any kind of markup tag can be used in the
                           caption.

<TR> ... </TR>             Specifies a table row within a table. You may
                           define default attributes for the entire row: ALIGN
                           (LEFT, CENTER, RIGHT) and/or VALIGN (TOP, MIDDLE,
                           BOTTOM).

<TH> ... </TH>             Defines a table header cell. By default the text
                           in this cell is bold and centered. Table header
                           cells may contain other attributes to determine the
                           characteristics of the cell and/or its contents.

<TD> ... </TD>             Defines a table data cell. By default the text
                           in this cell is aligned left and centered
                           vertically. Table data cells may contain other
                           attributes to determine the characteristics of the
                           cell and/or its contents.

Table Attributes

Attribute                            Description                           

ALIGN (LEFT, CENTER, RIGHT)          Horizontal alignment of a cell.       

VALIGN (TOP, MIDDLE, BOTTOM)         Vertical alignment of a cell.         

COLSPAN=n                            The number (n) of columns a cell      
                                     spans.                                

ROWSPAN=n                            The number (n) of rows a cell spans.  

NOWRAP                               Turn off word wrapping within a       
                                     cell.
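
The elements and attributes listed above can be combined as in the following sketch. The caption text and cell contents are made-up sample data, not taken from any real page.

```html
<!-- A small bordered table; all text in it is hypothetical sample data. -->
<TABLE BORDER>
<CAPTION ALIGN=BOTTOM>Sample table</CAPTION>
<TR>
  <TH>Item</TH>
  <TH>Quantity</TH>
</TR>
<TR ALIGN=CENTER VALIGN=TOP>
  <TD>Apples</TD>
  <TD>12</TD>
</TR>
<TR>
  <!-- One cell stretched across both columns, with word wrapping turned off -->
  <TD COLSPAN=2 NOWRAP>This cell spans both columns</TD>
</TR>
</TABLE>
```

Here the ALIGN and VALIGN values on the second <TR> act as defaults for every cell in that row, while COLSPAN and NOWRAP apply only to the single cell that carries them.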

Fill-out Forms

Forms

Web forms let a reader return information to a Web server for some action. For example, suppose you collect names and email addresses so you can email some information to people who request it. For each person who enters his or her name and address, you need some information to be sent and the respondent's particulars added to a database.

This processing of incoming data is usually handled by a script or program written in Perl or another language that manipulates text, files, and information. If you cannot write a program or script for your incoming information, you need to find someone who can do this for you.

The forms themselves are not hard to code. They follow the same constructs as other HTML tags. What could be difficult is the program or script that takes the information submitted in a form and processes it.

Form Tags

A form is a designated area of an HTML page, often rendered with a surrounding border, containing input fields and other interactive objects, such as pop-up menus, checkboxes, and buttons. There can be any number of forms on a page, each beginning and ending with the tags <FORM> and </FORM>. The opening FORM tag takes an ACTION attribute that specifies what should be done with the information entered by the reader. The ACTION attribute takes a URL as its value, which can be either the URL of a CGI script or a "mailto" URL.
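
A minimal sketch of a form whose ACTION is a "mailto" URL might look like the following. The address someone@example.com and the field name are placeholders, and whether the message is actually sent depends on how the reader's browser handles mail.

```html
<!-- The mail address and the field name below are hypothetical placeholders. -->
<FORM METHOD="post" ACTION="mailto:someone@example.com">
<P>Your name: <INPUT TYPE=text NAME="name"></P>
<P><INPUT TYPE=SUBMIT></P>
</FORM>
```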

Note that other HTML elements can be freely used inside and outside the form. Besides ACTION, the opening FORM tag takes a METHOD attribute, which indicates how the form's contents will be presented to the script or e-mailer. Always use the value "post" to specify that the content is presented as standard input.

How to create a simple form

Type <FORM> in your HTML document.
Each <FORM> tag has two important attributes that need to be set: METHOD and ACTION. The METHOD attribute indicates how the information inside the form will be transferred to the Web server. There are two choices: GET and POST. The critical difference between the two is that the GET method appends the form's values to the ACTION URL as one long concatenated string, while the POST method sends them to the server separately, in the body of the request. You'll almost always want to use the POST method with your forms.
The ACTION attribute tells the server what to do with the data contained in the form. This attribute usually contains the URL of a special program designed to process the data. Example:
<FORM METHOD=POST ACTION=".../CGI-BIN/PROCESS-
Enter your form labels using normal HTML markup codes. For example, to create a label to prompt the user to enter their last name type <P> LAST NAME:.
To insert a data field to allow the user to enter information into the form, type <INPUT>. This tells the Web browser to place a data field in the document and accept user input. There are several types of input fields available. One of the simplest types is the single-line text field.
To specify a single-line text field, enter TYPE=TEXT inside the INPUT tag.
Each input field needs to be assigned a name, so that it can be distinguished from other input fields. You can name the input field anything you like, but the name should be kept short and should not contain any spaces or special characters. For example:
<INPUT TYPE=text NAME="lastname">
You can set the displayed width of a text field with the SIZE attribute by typing SIZE=, followed by the width in characters in quotes. For a field 20 characters wide, the complete tag would be <INPUT TYPE=text NAME="lastname" SIZE="20">. To limit how many characters the user can type, add the MAXLENGTH attribute in the same way.
The last two input items that every form should have are the SUBMIT and RESET buttons. The SUBMIT button is pressed by the user when the form is completed, and sends all the information to the server. To include a SUBMIT button in your form, type <INPUT TYPE=SUBMIT> near the bottom of the form.
The RESET button allows the user to clear all of the fields in the form at once and reset them to their initial values so that new information can be added. Although the RESET button is not required, it is strongly recommended. To include it in your form, type <INPUT TYPE=RESET>.
Type </FORM> on a new line to close the form.
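
Put together, the steps above produce something like the following sketch. The ACTION value /cgi-bin/example-script is a hypothetical placeholder for whatever program processes your form; it is not a real script.

```html
<!-- /cgi-bin/example-script is a made-up URL; substitute your own program. -->
<FORM METHOD=POST ACTION="/cgi-bin/example-script">
<P>LAST NAME: <INPUT TYPE=text NAME="lastname" SIZE="20"></P>
<P><INPUT TYPE=SUBMIT> <INPUT TYPE=RESET></P>
</FORM>
```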

Form tips

Long forms usually work best when placed in their own HTML documents. If your form requires a lot of input, create a new HTML document just for the form and then create a hyperlink to it from your main page. This will eliminate clutter and confusion.

You're not limited to just input fields in your form. You can use all the normal HTML paragraph and character formatting codes. It's often a good idea to place brief paragraphs in front of groups of input fields to help explain what needs to be entered in the form.

How to Use Input Fields in Forms

Password

You can insert a password field into your form. This acts like a single-line text field, but hides the input by displaying asterisks (**) in place of the actual characters entered. To insert a password field into your form, type <INPUT NAME="password" TYPE=PASSWORD>. You can set the displayed width of the field with the SIZE attribute and limit the length of the password with the MAXLENGTH attribute.

Range

Range fields allow the user to select a numeric value that falls between predetermined minimum and maximum values. These values are set using the MIN and MAX attributes. For example, to insert a range field that allows the user to assign a test score value between 0 and 100, type <INPUT NAME="score" TYPE=RANGE MIN=0 MAX=100>.

Checkbox

Checkbox fields allow the user to select or deselect an item. The VALUE attribute gives the value sent to the server when the box is selected, and you can initialize the field to be selected by adding the CHECKED attribute. The label for the checkbox is typed immediately after the <INPUT> tag. For example, you might include a checkbox field on your form to allow users to specify whether or not they'd like to receive a newsletter. To insert this field into your form, type <INPUT NAME="getnews" TYPE=checkbox VALUE="yes" CHECKED>Check here to receive our newsletter.

Radio Buttons

Radio button fields allow the user to make a selection from a group of choices. Only one item can be selected from the radio button group. To insert a radio button group into your form, type <INPUT NAME="groupname" TYPE=radio VALUE="value1">. Each item in the group is entered with a separate <INPUT> tag and a unique VALUE attribute, but all of the items in the same radio button group should have the same NAME attribute.

Example:

<P>Please choose one:<BR>

<INPUT NAME="respond" TYPE=radio VALUE="yes">Yes

<INPUT NAME="respond" TYPE=radio VALUE="no">No

<INPUT NAME="respond" TYPE=radio VALUE="maybe">Maybe

</P>

Attachments

You can add file attachments to the form by using the file type. This allows users to attach a file to a form by either typing the file name or selecting it from a browse dialog. To insert a file attachment field, type: <INPUT NAME="attachment" TYPE=file>.
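
A file field is normally placed in a form that uses METHOD=POST together with ENCTYPE="multipart/form-data", so that the contents of the file, and not just its name, are sent. The sketch below assumes a hypothetical receiving script at /cgi-bin/example-upload; browser support for file fields varies.

```html
<!-- /cgi-bin/example-upload is a made-up URL for the program that receives the file. -->
<FORM METHOD=POST ENCTYPE="multipart/form-data" ACTION="/cgi-bin/example-upload">
<P>File to send: <INPUT NAME="attachment" TYPE=file></P>
<P><INPUT TYPE=SUBMIT></P>
</FORM>
```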

Free-form fields

You can also insert a free-form field for text, which allows the user to enter more than just a single line of text. Instead of using the <INPUT> tag, use the <TEXTAREA> and </TEXTAREA> tag pair. The <TEXTAREA> field shows as many rows of input as you specify with the ROWS attribute. You can also specify the number of columns (the line width) in the TEXTAREA field with the COLS attribute. For example, to create a field to allow a user to enter comments, type <TEXTAREA NAME="comments" ROWS=6 COLS=65></TEXTAREA>. This would leave room for six lines of up to 65 characters each.

Selection Menus

To include a selection menu that offers a number of choices on a form, use the <SELECT> and </SELECT> tags. You need to assign a NAME attribute for your selection menu. For example, to allow a user to select a color, you would type <SELECT NAME="color">. If you want to allow multiple selections to be made, insert the attribute MULTIPLE inside the <SELECT> tag. Each item in a selection menu is typed in using the <OPTION> tag. Enter each menu choice on a separate line. When you've finished typing in all the option items, type </SELECT>.

Example:

<SELECT NAME="color">

<OPTION>Red

<OPTION>Blue

<OPTION>Green

<OPTION>Purple

</SELECT>
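
The field types in this section can be combined in a single form. The following is only a sketch: the ACTION URL is a hypothetical placeholder, and the labels are invented for illustration.

```html
<!-- /cgi-bin/example-script is a made-up URL; the labels are sample text. -->
<FORM METHOD=POST ACTION="/cgi-bin/example-script">
<P>Password: <INPUT NAME="password" TYPE=PASSWORD SIZE="12"></P>
<P><INPUT NAME="getnews" TYPE=checkbox VALUE="yes" CHECKED>Check here to receive our newsletter.</P>
<P>Comments:<BR>
<TEXTAREA NAME="comments" ROWS=6 COLS=65></TEXTAREA></P>
<P>Favorite colors (more than one may be selected):<BR>
<SELECT NAME="color" MULTIPLE>
<OPTION>Red
<OPTION>Blue
<OPTION>Green
<OPTION>Purple
</SELECT></P>
<P><INPUT TYPE=SUBMIT> <INPUT TYPE=RESET></P>
</FORM>
```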

Frames

How to Create Frame Documents

Open a new document in Notepad, and type in <HTML>. Press Enter, then type in <HEAD>. Press Enter again.
Type <TITLE>My First Frame Document</TITLE>, then press Enter. On the next line, type </HEAD> and press Enter one more time.
So far, this is like a normal HTML document. Here's where things get different, though. Instead of typing <BODY>, type <FRAMESET>. The <FRAMESET> tag instructs Netscape that this is a frame layout document.
Place the cursor inside the <FRAMESET> tag and type ROWS="*,*,*". This creates three horizontal frames of equal relative height. The asterisk character instructs the browser to give the frame all the remaining space in the window. Because there are three asterisks, Netscape will give each frame one-third of the available space.
On the next line, type <FRAME NAME="frame1" SRC="blank.html">. This assigns the name frame1 to the first frame in your document. The SRC attribute tells the browser to display the HTML document named blank.html in this frame. Normally, you would place a real HTML document in the SRC attribute. For this example, we'll just use blank.html, a made-up file name that doesn't really exist. Press Enter when you're finished.
Type <FRAME NAME="frame2" SRC="blank.html"> and then press Enter. On the next line, type <FRAME NAME="frame3" SRC="blank.html"> and press Enter again. Now we've created three empty frames.
Type </FRAMESET> and press Enter. Then type </HTML>.
Save your document in Notepad as myframe.html.
Now you have created a very simple frame document that contains three empty frames. Review the finished example below:

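<HTML>

<HEAD>

<TITLE>My First Frame Document</TITLE>

</HEAD>

<FRAMESET ROWS="*,*,*">

<FRAME NAME="frame1" SRC="blank.html">

<FRAME NAME="frame2" SRC="blank.html">

<FRAME NAME="frame3" SRC="blank.html">

</FRAMESET>

</HTML>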

How to Use Targets in Frames

Frames are updated using targets. Targets are simply hyperlink tag extensions that contain a frame name. Targets specify which frame the hyperlink should update.
Before we go any further, we'll need to create a few HTML documents that contain hyperlinks using targets. Launch Notepad and open a new document. Then type <HTML> and press Enter.
Type <HEAD> and press Enter. Then type <TITLE>Document A</TITLE> and press Enter. Finally, type </HEAD> and press Enter again.
Type <BODY> and press Enter. Then type <H1>Document A</H1> and press Enter.
Type <P> to start your first paragraph. Then type Top Frame: and press Enter.
Here's where we'll start placing hyperlinks with target attributes. These three hyperlinks will allow the user to display different documents in the top frame. Type <A HREF="a.html" TARGET="frame1">A</A>. This link will load a.html (the document you're creating right now) into the frame named frame1. In the frame document you created in the previous procedure, frame1 was the top frame.
Press Enter, then type <A HREF="b.html" TARGET="frame1">B</A>. This link will load a document named b.html into the top frame. Press Enter again and then type <A HREF="c.html" TARGET="frame1">C</A>. As you've probably guessed by now, this hyperlink will load c.html into the top frame. Press Enter again.
Type <BR> to force a line break and press Enter. Then type Middle Frame: and press Enter again.
Type in all three hyperlinks again, only this time, change the target to frame2. This will instruct the browser to load the documents into the middle frame.
When you've finished, type <BR> to force another line break and press Enter. Then type Bottom Frame: and press Enter again. Type in the hyperlinks again, with the target set to frame3. When you're finished, press Enter and type </P> to close the paragraph. Then press Enter again.
Type </BODY> and then press Enter. Then type </HTML>.
Save this document as a.html. Make sure that you save it in the same folder as myframe.html, which you created in the last procedure.
Repeat this process two more times and save the files as b.html and c.html. To save a lot of typing, you can simply change the <TITLE> and <H1> tags at the top of the document and save the existing file under a new name. Just choose Save As from the File menu and type in the new file name.
Open myframe.html in Notepad. Place the cursor inside the SRC attribute of the first <FRAME> tag, and change the URL from blank.html to a.html. Change the URLs for the next two <FRAME> tags to b.html and c.html, respectively.
Choose Save from the File menu to save the changes to myframe.html.
Launch Netscape and open myframe.html. Three frames will appear. Each of your three HTML documents, A, B, and C, will appear in a different frame.
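
Assembled from the steps above, a.html would look roughly like the following listing. The exact line breaks are not significant, and b.html and c.html differ only in their <TITLE> and <H1> text.

```html
<HTML>
<HEAD>
<TITLE>Document A</TITLE>
</HEAD>
<BODY>
<H1>Document A</H1>
<P>Top Frame:
<!-- Each TARGET names the frame from myframe.html that the link should update -->
<A HREF="a.html" TARGET="frame1">A</A>
<A HREF="b.html" TARGET="frame1">B</A>
<A HREF="c.html" TARGET="frame1">C</A>
<BR>
Middle Frame:
<A HREF="a.html" TARGET="frame2">A</A>
<A HREF="b.html" TARGET="frame2">B</A>
<A HREF="c.html" TARGET="frame2">C</A>
<BR>
Bottom Frame:
<A HREF="a.html" TARGET="frame3">A</A>
<A HREF="b.html" TARGET="frame3">B</A>
<A HREF="c.html" TARGET="frame3">C</A>
</P>
</BODY>
</HTML>
```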

How to Create Nested Frames

Open a new document in Notepad, and type in the lines:

<HTML>

<HEAD>

<TITLE>Nested Frames</TITLE>

</HEAD>

Type <FRAMESET ROWS="*,*"> to divide the screen into two frames. Then press Enter.

Type <FRAME SRC="a.html" NAME=frame1>. This will place the document a.html in the top frame. Now press Enter again.

Instead of inserting another <FRAME> tag, we'll nest another <FRAMESET> tag pair, using COLS instead of ROWS. This will have the effect of splitting the bottom frame into two separate frames. Type <FRAMESET COLS="*,*"> and press Enter.

Create frame declarations for the two nested frames. Type <FRAME SRC="b.html" NAME=frame2>. Then press Enter and type <FRAME SRC="c.html" NAME=frame3>. Then press Enter again.

Close the nested <FRAMESET> tag by typing </FRAMESET>, and then press Enter. Then close the first <FRAMESET> tag by typing </FRAMESET> again and pressing Enter. When you're finished, type </HTML>.

Save your document as newframe.html, and place it in the same folder as a.html, b.html, and c.html, which you created earlier.

If you open newframe.html in Netscape, you'll notice that it looks a lot like the first frame document you created, but with one major difference: the bottom two frames are now aligned side by side instead of one on top of the other.
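
For reference, the steps in this procedure add up to a newframe.html along the lines of the following listing. The indentation is only there to make the nesting visible.

```html
<HTML>
<HEAD>
<TITLE>Nested Frames</TITLE>
</HEAD>
<!-- Outer frameset: two rows of equal height -->
<FRAMESET ROWS="*,*">
  <FRAME SRC="a.html" NAME="frame1">
  <!-- Inner frameset takes the place of the second row and splits it into two columns -->
  <FRAMESET COLS="*,*">
    <FRAME SRC="b.html" NAME="frame2">
    <FRAME SRC="c.html" NAME="frame3">
  </FRAMESET>
</FRAMESET>
</HTML>
```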

How to Use Non-Standard HTML

There are at least three major versions of HTML currently in use: HTML 2.0, HTML 3.0, and the Netscape extensions to HTML 2.0. HTML 2.0 is the closest thing to current practice that is available, and can be assumed to be "safe" for all browsers.

Two rules to go by:

If two or more popular browsers support an extension, it's probably fine to use.
If an extension is not widely supported, but it will not adversely affect your document if it is missing, it's probably fine to use.

In general, try to think about the effect that the non-standard elements will have if they are not recognized. These elements can be used intelligently, and on browsers that recognize them, they can dramatically enhance the presentation of your page. If it is not possible to use the elements in such a way that rendering is still good on all clients, think about providing multiple copies of the document and possibly using content negotiation on the server to provide the reader with the correct version of the document.

HTML 2.0 Extensions

See what's been added or improved for HTML 2.0.

ISINDEX

ISINDEX now has a PROMPT attribute. ISINDEX indicates that a document is a searchable index. PROMPT has been added so the document author can specify what message they want to appear before the text input field of the index. The default is:

This is a searchable index. Enter search keywords:
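
As a sketch, a document that replaces the default message might contain a tag like the one below. The prompt wording is an invented example.

```html
<!-- The prompt text is a made-up example of a custom message. -->
<ISINDEX PROMPT="Enter a name to look up: ">
```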

HR

HR specifies that a horizontal rule of some sort (the default being a shaded engraved line) be drawn across the page. There are now four new attributes that give the document author some ability to describe how the horizontal rule should look.

<HR SIZE=number> The SIZE attribute lets the author give an indication of how thick they wish the horizontal rule to be.

<HR WIDTH=number|percent> The default horizontal rule is always as wide as the page. With the WIDTH attribute, the author can specify an exact width in pixels, or a relative width measured in percent of document width.

<HR ALIGN=left|right|center> Now that horizontal rules do not have to be the width of the page, the author needs a way to specify whether they should be pushed up against the left margin, the right margin, or centered in the page.

<HR NOSHADE> Finally, for those times when you really want a solid bar, the NOSHADE attribute lets you specify that you do not want any fancy shading of your horizontal rule.
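
A few rules using these options might be written as in the sketch below. The particular sizes and widths are arbitrary values picked for illustration.

```html
<!-- A thicker rule across the full width of the page -->
<HR SIZE=4>
<!-- A rule half as wide as the document, centered -->
<HR WIDTH="50%" ALIGN=center>
<!-- A rule exactly 200 pixels wide, against the left margin -->
<HR WIDTH=200 ALIGN=left>
<!-- A solid bar with no shading -->
<HR NOSHADE SIZE=2>
```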

UL

Your basic bulleted list has a default progression of bullet types that changes as you move through indented levels: from a solid disc, to a circle, to a square. A TYPE attribute is added to the UL element so that, no matter what your indent level, you can specify whether you want TYPE=disc, TYPE=circle, or TYPE=square as your bullet.

OL

Your average ordered list counts 1, 2, 3, etc. The TYPE attribute is also added to this element to allow authors to specify whether they want their list items marked with capital letters (TYPE=A), small letters (TYPE=a), large roman numerals (TYPE=I), small roman numerals (TYPE=i), or the default numbers (TYPE=1).

For lists that should start at a value other than 1, there is a new attribute, START. START is always specified in the default numbers, and will be converted based on TYPE before display. Thus START=5 would display either an 'E', 'e', 'V', 'v', or '5' based on the TYPE attribute.

LI

To give even more flexibility to lists, the TYPE attribute is added to the LI element as well. It takes the same values as either UL or OL, depending on the type of list you are in, and it changes the list type for that item and all subsequent items. For ordered lists, the VALUE attribute lets you change the count for that list item and all subsequent items.
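
The list options for UL, OL, and LI can be sketched as follows. The item text is invented, and the markers named in the items are what the rules described above would produce in a browser that supports these extensions.

```html
<!-- Square bullets regardless of indent level -->
<UL TYPE=square>
<LI>First item
<LI TYPE=circle>This item and those after it use circles
<LI>Still a circle
</UL>

<!-- Capital letters, starting at the fifth value -->
<OL TYPE=A START=5>
<LI>Marked E
<LI>Marked F
<LI VALUE=10>Marked J, because VALUE resets the count to 10
<LI>Marked K
</OL>
```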

IMG

The IMG tag is probably the most extended tag.

The additions to your ALIGN options need a lot of explanation. First, the values "left" and "right": images with those alignments are an entirely new floating image type. An ALIGN=left image will float down and over to the left margin (into the next available space there), and subsequent text will wrap around the right-hand side of that image. Likewise, for ALIGN=right the image aligns with the right margin, and the text wraps around the left.

The rest of the ALIGN options try to correct the errors made when first implementing the IMG tag, without destroying the look of existing documents.

ALIGN=top does just what it always did, which is align the image with the top of the tallest item in the line.

ALIGN=texttop does what many people thought top should do, which is align the image with the top of the tallest text in the line (this is usually but not always the same as ALIGN=top).

ALIGN=middle does just what it always did: it aligns the baseline of the current line with the middle of the image.

ALIGN=absmiddle does what middle should have done, which is align the middle of the current line with the middle of the image.

ALIGN=baseline aligns the bottom of the image with the baseline of the current line.

ALIGN=bottom does just what it always did (which is identical to ALIGN=baseline, but baseline is a better name).

ALIGN=absbottom does what bottom should have done, which is align the bottom of the image with the bottom of the current line.

<IMG WIDTH=value HEIGHT=value> The WIDTH and HEIGHT attributes were added to IMG mainly to speed up display of the document. If the author specifies these, the viewer of the document will not have to wait for the image to be loaded over the network and its size calculated before the rest of the page is laid out.

<IMG BORDER=value> This lets the document author control the thickness of the border around an image displayed. Warning: setting BORDER=0 on images that are also part of anchors may confuse your users as they are used to a colored border indicating an image is an anchor.

<IMG VSPACE=value HSPACE=value> For the floating images it is likely that the author does not want them pressing up against the text wrapped around the image. VSPACE controls the vertical space above and below the image, while HSPACE controls the horizontal space to the left and right of the image.

BR

With the addition of floating images, we needed to expand the BR tag. Normal BR still just inserts a line break. We have added a CLEAR attribute to BR, so CLEAR=left will break the line and move vertically down until you have a clear left margin (no floating images). CLEAR=right does the same for the right margin, and CLEAR=all moves down until both margins are clear of images.
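
The IMG and BR extensions work together, as in this sketch. The file name photo.gif and its dimensions are hypothetical; use the real pixel size of your own image.

```html
<!-- photo.gif and its 120 x 90 size are made-up values for illustration. -->
<P>
<IMG SRC="photo.gif" ALT="Photo" ALIGN=left WIDTH=120 HEIGHT=90 HSPACE=10 VSPACE=5 BORDER=2>
This text wraps around the right-hand side of the floating image.
<BR CLEAR=left>
This text starts below the image, where the left margin is clear again.
</P>
```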

New Elements

NOBR
WBR
FONT SIZE=value
BASEFONT SIZE=value
CENTER

NOBR

The NOBR element stands for NO BReak. This means all the text between the start and end of the NOBR elements cannot have line breaks inserted between them. While NOBR is essential for those odd character sequences you really don't want broken, please be careful; long text strings inside of NOBR elements can look rather odd.

WBR

The WBR element stands for Word BReak. This is for the very rare case when you have a NOBR section and you know exactly where you want it to break. It is also useful any time you want to help the Netscape Navigator by telling it where a word is allowed to be broken. The WBR element does not force a line break (BR does that); it simply lets the Netscape Navigator know where a line break is allowed to be inserted if needed.
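
A short sketch of the two elements used together, with invented text:

```html
<!-- The text is kept on one line, except that a break is permitted at the WBR. -->
<NOBR>This long line is kept together and may break only here:<WBR> at the WBR element.</NOBR>
```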

FONT SIZE=value

You can change the font size. Valid values range from 1-7. The default font size is 3. The value given to SIZE can optionally have a '+' or '-' character in front of it to specify that it is relative to the document basefont. The default basefont is 3, and can be changed with the BASEFONT element.

BASEFONT SIZE=value

This changes the size of the BASEFONT that all relative FONT changes are based on. It defaults to 3, and has a valid range of 1-7.
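
The following sketch shows absolute and relative sizes together. The base size of 4 is an arbitrary choice, and the relative values are quoted because they begin with a sign character.

```html
<!-- Raise the base size from the default of 3 to 4 -->
<BASEFONT SIZE=4>
<P>Text at the base size,
<FONT SIZE="+2">two sizes larger than the base</FONT>,
<FONT SIZE="-1">one size smaller than the base</FONT>,
and <FONT SIZE=7>the largest absolute size</FONT>.</P>
```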

CENTER

Yes, you can center your text. All lines of text between the beginning and end of CENTER are centered between the current left and right margins. A new element has been introduced rather than using the proposed <P Align="center"> because using <P Align="center"> breaks many existing browsers when the <P> tag is used as a container. The <P Align="center"> tag is also less general and does not support all cases where centering may be desired.
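
For instance, a centered heading and paragraph could be marked up as in this sketch (the text is invented):

```html
<CENTER>
<H2>A centered heading</H2>
<P>This paragraph is centered between the margins as well.</P>
</CENTER>
```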

HTML 3.0 Extensions

New Fonts

BIG (big print)

The BIG element specifies that the enclosed text should be displayed, if practical, using a big font (compared with normal text).

SMALL (small print)

The SMALL element specifies that the enclosed text should be displayed, if practical, using a small font (compared with normal text).

<SUB> (subscript)

The SUB element specifies that the enclosed text should be displayed as a subscript, and, if practical, using a smaller font (compared with normal text). The ALIGN attribute for SUB is only meaningful within the MATH element.

<SUP> (superscript)

The SUP element specifies that the enclosed text should be displayed as a superscript, and if practical, using a smaller font (compared with normal text). The ALIGN attribute for SUP is only applicable within the MATH element.
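
The four elements in this section might be used as in the following sketch. Browsers that do not recognize them simply show the enclosed text at normal size on the normal baseline.

```html
<P>This word is <BIG>big</BIG> and this word is <SMALL>small</SMALL>.</P>
<P>Water is H<SUB>2</SUB>O, and x<SUP>2</SUP> is x squared.</P>
```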